Inline filtering for vector sets #1890

Open
metajack wants to merge 7 commits into main from push-zyqmnwkmszpk

@metajack

Based on Haiyang's original two-queue work, this is rebased onto the quantization branch and modified for the new inline filtering with adaptive L.

@metajack metajack force-pushed the push-zyqmnwkmszpk branch from c42adbb to 64a83a7 Compare June 22, 2026 14:50
@metajack metajack force-pushed the push-zyqmnwkmszpk branch from 64a83a7 to 56d244c Compare June 22, 2026 21:04
@metajack metajack marked this pull request as ready for review June 23, 2026 20:34
Copilot AI review requested due to automatic review settings June 23, 2026 20:34

Copilot AI left a comment

Pull request overview

This PR updates Garnet’s vector set similarity search (VSIM) to support inline filtering via a DiskANN → C# callback (instead of relying on over-retrieval + post-filtering), and updates the public docs/tests around filtering and FILTER-EF.

Changes:

  • Adds an unmanaged inline-filter callback wiring from DiskANN (Rust) into Garnet (C#) and threads per-query compiled filter state via [ThreadStatic].
  • Changes FILTER-EF semantics/limits (default 16, range [4, 256]) and updates validation + tests accordingly.
  • Introduces a documented binary attribute encoding/extraction path intended to accelerate filter evaluation, and adds a new design doc describing the end-to-end approach.
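The FILTER-EF default and bounds listed above can be sketched as a small validation helper. This is a minimal illustration only: the class and method names are hypothetical and not taken from the PR, and the only values drawn from the overview are the default of 16 and the range [4, 256].

```csharp
using System;
using System.Globalization;

// Hypothetical helper; names are illustrative and do not come from the PR.
public static class FilterEfOption
{
    public const int Default = 16; // default stated in the PR overview
    public const int Min = 4;      // lower bound stated in the PR overview
    public const int Max = 256;    // upper bound stated in the PR overview

    // Returns true with the effective value when the option is absent or valid.
    // Returns false when the supplied text is not a plain integer in [Min, Max].
    public static bool TryResolve(string? raw, out int value)
    {
        if (raw is null)
        {
            value = Default;
            return true;
        }

        // NumberStyles.None accepts digits only (no sign, no whitespace).
        if (int.TryParse(raw, NumberStyles.None, CultureInfo.InvariantCulture, out value)
            && value >= Min && value <= Max)
        {
            return true;
        }

        value = 0;
        return false;
    }
}
```

A caller would treat a `false` result as a syntax error for the command and otherwise pass `value` on as the scale factor.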

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 9 comments.

| File | Description |
|------|-------------|
| `website/docs/dev/filtered-search-design.md` | New end-to-end design doc for filtered vector search and inline filtering. |
| `website/docs/commands/vector-sets.md` | Updates VSIM option docs for FILTER-EF and inline filtering behavior. |
| `test/standalone/Garnet.test.vectorset/RespVectorSetTests.cs` | Adds/updates VSIM filter validation tests and new “bad filter” cases. |
| `test/standalone/Garnet.test.extensions/DiskANN/DiskANNServiceTests.cs` | Updates DiskANN index creation tests for the new callback parameter. |
| `libs/server/Resp/Vector/VectorManager.Migration.cs` | Passes the new inline filter callback when recreating indexes. |
| `libs/server/Resp/Vector/VectorManager.Locking.cs` | Passes the new inline filter callback when creating/recreating indexes. |
| `libs/server/Resp/Vector/VectorManager.Filter.cs` | Adds thread-static inline filter state + candidate evaluation logic. |
| `libs/server/Resp/Vector/VectorManager.cs` | Switches VSIM paths toward inline filtering setup and bitmap sizing helper. |
| `libs/server/Resp/Vector/VectorManager.Callbacks.cs` | Adds the unmanaged callback entrypoint to call into filter evaluation. |
| `libs/server/Resp/Vector/VectorFilterExpression.cs` | Simplifies ExprProgram by removing redundant length fields. |
| `libs/server/Resp/Vector/RespServerSessionVectors.cs` | Updates FILTER-EF parsing/validation defaults and bounds. |
| `libs/server/Resp/Vector/ExprRunner.cs` | Iterates using program.Instructions.Length instead of removed program.Length. |
| `libs/server/Resp/Vector/DiskANNService.cs` | Extends create/recreate index P/Invoke signature to include filter callback. |
| `libs/server/Resp/Vector/AttributeExtractor.cs` | Adds binary attribute conversion/extraction APIs and minor JSON parsing cleanup. |
| `Directory.Packages.props` | Bumps diskann-garnet package version. |

Comment on lines 780 to 786
// Apply post-filtering if filter is specified
if (!filter.IsEmpty)
{
    // Ensure bitmap is large enough for the over-retrieved result set
    var requiredBitmapBytes = (found + 7) >> 3;
    if (requiredBitmapBytes > filterBitmap.Length)
    {
        if (!filterBitmap.IsSpanByte)
        {
            filterBitmap.Memory.Dispose();
        }

        filterBitmap = new SpanByteAndMemory(MemoryPool<byte>.Shared.Rent(requiredBitmapBytes), requiredBitmapBytes);
    }
    EnsureFilterBitmapSize(ref filterBitmap, found);

    _ = ApplyPostFilter(filter, found, outputAttributes.ReadOnlySpan, filterBitmap.Span, ActiveThreadSession.scratchBufferBuilder);
}
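The sizing expression in the excerpt above, `(found + 7) >> 3`, rounds a bit count up to whole bytes. A minimal sketch of that arithmetic follows. The helper names are hypothetical, and the least-significant-bit-first layout is an assumption, since the excerpt does not show how bits are ordered within a byte.

```csharp
using System;

// Hypothetical helpers; only the (count + 7) >> 3 expression comes from the excerpt.
public static class FilterBitmapMath
{
    // Number of bytes needed to hold one bit per result (ceiling of count / 8).
    public static int RequiredBytes(int count) => (count + 7) >> 3;

    // Marks result `index` as passing. Assumes bit 0 of each byte is the lowest index.
    public static void Set(Span<byte> bitmap, int index)
        => bitmap[index >> 3] |= (byte)(1 << (index & 7));

    // Reads back the flag for result `index` under the same assumed layout.
    public static bool IsSet(ReadOnlySpan<byte> bitmap, int index)
        => (bitmap[index >> 3] & (1 << (index & 7))) != 0;
}
```

For example, 9 results need 2 bytes, and the ninth result (index 8) lands in bit 0 of the second byte under this layout.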
Comment on lines +861 to +868
var instrCount = ExprCompiler.TryCompile(filter, instrBuf, tuplePoolBuf, tokensBuf, opsStackBuf, out var tupleCount, out _);
if (instrCount < 0)
{
    outputDistances.Length = 0;
    filterBitmap.Length = 0;
    outputIdFormat = VectorIdFormat.I32LengthPrefixed;
    return VectorManagerResult.OK;
}
Comment on lines +276 to +306
// 1. Read external ID for this internal_id via ExtMap
Span<byte> iidKey = stackalloc byte[sizeof(uint)];
BinaryPrimitives.WriteUInt32LittleEndian(iidKey, internalId);

Span<byte> eidBuf = stackalloc byte[128];
var eidMem = SpanByteAndMemory.FromPinnedSpan(eidBuf);
try
{
    if (!ReadSizeUnknown(context | DiskANNService.ExternalIdMap, true, iidKey, ref eidMem))
        return 0; // can't find external ID → exclude

    // 2. Read attributes by external ID
    Span<byte> attrBuf = stackalloc byte[256];
    var attrMem = SpanByteAndMemory.FromPinnedSpan(attrBuf);
    try
    {
        if (!ReadSizeUnknown(context | DiskANNService.Attributes, true, eidMem.ReadOnlySpan, ref attrMem))
            return 0; // no attributes → exclude

        // 3. Rebuild ExprProgram from thread-static state pointers
        var program = new ExprProgram
        {
            Instructions = state.InstrBuf,
            TuplePool = state.TuplePoolBuf,
            RuntimePool = state.RuntimePoolBuf,
            RuntimePoolLength = 0,
        };

        program.ResetRuntimePool();

        AttributeExtractor.ExtractFields(attrMem.ReadOnlySpan, state.FilterBytes, state.SelectorRanges, state.ExtractedFields, ref program);
Comment on lines +565 to +569
output[pos] = 8;
output[pos + 1] = 0;
pos += 2;
_ = BitConverter.TryWriteBytes(output[pos..], numVal);
pos += 8;
Comment on lines +684 to +688
if (valueLen == 8)
{
    var numVal = System.BitConverter.ToDouble(binary[pos..]);
    results[matchIndex] = ExprToken.NewNum(numVal);
}
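Taken together, the two excerpts above suggest a numeric value stored as a 2-byte length of 8 followed by an 8-byte double. The sketch below round-trips such a field under that inferred layout. The type and method names are hypothetical, and any surrounding framing (field names, type tags) that the real encoder may emit is not shown in the excerpts and is left out here.

```csharp
using System;
using System.Buffers.Binary;

// Illustrative only: layout inferred from the excerpts, names are hypothetical.
public static class NumericFieldCodec
{
    public const int EncodedSize = sizeof(ushort) + sizeof(double); // 2 + 8 bytes

    // Writes [length: u16 little-endian = 8][value: f64 little-endian].
    // Returns the number of bytes written, or 0 if the destination is too small.
    public static int Write(Span<byte> output, double value)
    {
        if (output.Length < EncodedSize)
            return 0;

        BinaryPrimitives.WriteUInt16LittleEndian(output, sizeof(double));
        BinaryPrimitives.WriteDoubleLittleEndian(output[sizeof(ushort)..], value);
        return EncodedSize;
    }

    // Reads the field back; fails on short input or a length other than 8.
    public static bool TryRead(ReadOnlySpan<byte> input, out double value)
    {
        value = 0;
        if (input.Length < EncodedSize)
            return false;
        if (BinaryPrimitives.ReadUInt16LittleEndian(input) != sizeof(double))
            return false;

        value = BinaryPrimitives.ReadDoubleLittleEndian(input[sizeof(ushort)..]);
        return true;
    }
}
```

The excerpts use `BitConverter`, which follows the byte order of the machine it runs on. The sketch pins little-endian on both sides so the stored bytes stay portable; that is a choice made for this illustration and not necessarily what the PR does.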

This would reduce the per-candidate cost to **zero extra store reads** — the only remaining overhead is the binary field scan and expression evaluation.

### Further with attribute index: Pre-built attribute index to replace per-candidate filter evaluation
Comment on lines +34 to +43
### Solution: Add a second attribute store optimized for query-time filter evaluation

The current change **adds a new attribute store** alongside the existing one. The two stores serve different purposes:

| Store | Keyed by | Format | Purpose |
|-------|----------|--------|---------|
| Existing | External ID (user key) | Raw JSON | RESP command operations (`VGETATTR`, `VSETATTR`, etc.) |
| **New** | Internal ID (DiskANN ID) | Binary | Inline filter evaluation at query time |

The existing external ID keyed JSON store is untouched — it continues to serve all RESP command operations. The new internal ID keyed binary store is a **write-time derived projection** of the same data, optimized purely for the inline filter callback's access pattern.
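The relationship between the two stores in the table above can be modeled with a small in-memory sketch. Everything here is hypothetical: the class, its members, and the `project` delegate stand in for whatever storage and JSON-to-binary conversion the design actually uses. The only point illustrated is that the binary entry is derived from the JSON at write time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of the two stores; not the PR's storage layer.
public sealed class DualAttributeStore
{
    private readonly Dictionary<string, string> jsonByExternalId = new();
    private readonly Dictionary<uint, byte[]> binaryByInternalId = new();
    private readonly Func<string, byte[]> project; // assumed JSON -> binary converter

    public DualAttributeStore(Func<string, byte[]> project) => this.project = project;

    // A single write updates both stores, so the binary copy is always
    // a projection of the JSON that was just written.
    public void SetAttributes(string externalId, uint internalId, string json)
    {
        jsonByExternalId[externalId] = json;
        binaryByInternalId[internalId] = project(json);
    }

    // Read path for command operations: keyed by the user-visible ID.
    public bool TryGetJson(string externalId, out string? json)
        => jsonByExternalId.TryGetValue(externalId, out json);

    // Read path for the filter callback: keyed by the index's internal ID.
    public bool TryGetBinary(uint internalId, out byte[]? binary)
        => binaryByInternalId.TryGetValue(internalId, out binary);
}
```

In this model the filter path needs one lookup by internal ID, instead of first mapping the internal ID to an external ID and then fetching the JSON.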
Comment on lines 946 to 952
// Apply post-filtering if filter is specified
if (!filter.IsEmpty)
{
    // Ensure bitmap is large enough for the over-retrieved result set
    var requiredBitmapBytes = (found + 7) >> 3;
    if (requiredBitmapBytes > filterBitmap.Length)
    {
        if (!filterBitmap.IsSpanByte)
        {
            filterBitmap.Memory.Dispose();
        }

        filterBitmap = new SpanByteAndMemory(MemoryPool<byte>.Shared.Rent(requiredBitmapBytes), requiredBitmapBytes);
    }
    EnsureFilterBitmapSize(ref filterBitmap, found);

    _ = ApplyPostFilter(filter, found, outputAttributes.ReadOnlySpan, filterBitmap.Span, ActiveThreadSession.scratchBufferBuilder);
}
Comment on lines 384 to +385
| `FILTER expr` | _none_ | Post-filter results by an attribute expression (see [Filter Expressions](#filter-expressions)). |
| `FILTER-EF n` | `min(COUNT * 200, 100000000)` | Maximum number of nearest neighbors to **inspect** before filtering. Must be in `[0, 100000000]`. |
| `FILTER-EF n` | `16` | Scale factor for adaptive inline filter search. Must be in `[4, 256]`. This controls how high the EF will scale based on selectivity. |

3 participants